Data Exploration

Plot all the relevant variables against each other:

pairs(~ Number.of.states + Impairment + Tolerance + Speaker.entropy + Hearer.entropy + 
        Speaker.Voronoiness + Hearer.Voronoiness +
        Speaker.informativity + Hearer.informativity + Expected.utility + Iterations +
        Speaker.Convex.Cat, data=ds)

(Figure: pairs plot of all relevant variables)

Some properties are very clearly positively correlated with one another, and these two “clusters” of properties are in turn strongly negatively correlated with each other:

library(ellipse)  # provides plotcorr()

dss = ds[c("Impairment", "Speaker.informativity", "Hearer.informativity", "Expected.utility",
           "Speaker.entropy")]

ctab <- cor(dss)

plotcorr(ctab)

(Figure: correlation matrix plot of the selected variables)
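The ellipse plot can be complemented by printing the correlations numerically. A minimal sketch on synthetic data (the data frame `toy` and its column names are hypothetical, standing in for the `dss` subset): two positively correlated variables that are both negatively correlated with a third show up as signed entries in the matrix, mirroring the two “clusters” above.

```r
# Toy stand-in for the dss subset: x and y form a positively
# correlated "cluster", and both are negatively correlated with z.
set.seed(1)
n <- 100
x <- rnorm(n)
y <- x + rnorm(n, sd = 0.2)   # strongly positively correlated with x
z <- -x + rnorm(n, sd = 0.2)  # strongly negatively correlated with x and y

toy <- data.frame(x = x, y = y, z = z)
print(round(cor(toy), 2))     # signed correlation matrix
```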

Replotting data for each combination of values of independent variables

We have three independent variables: impairment, tolerance, and number of states. The following plots show the means over all 50 data points for each combination of independent-variable values, together with the estimated confidence intervals.

library(ggplot2)  # plotting
library(Rmisc)    # provides summarySE()

# dm is the melted data frame (see below)
dms = summarySE(dm, groupvars = c("Number.of.states", "Impairment", "Tolerance", "variable"), measurevar="value")
dsub = dms

pd <- position_dodge(.01)
sp = ggplot(dsub, aes(x=Impairment, y=value, colour=factor(Number.of.states))) + 
     geom_point(position = pd) +
     geom_errorbar(aes(ymin=value-ci, ymax=value+ci), width=0, position=pd) +
     geom_line(position=pd)

sp + facet_grid(variable ~ Tolerance, scales="free")

(Figure: means with confidence intervals, faceted by variable and tolerance)

Informativity

This is the notion from Skyrms. It seems very well-behaved. There is a clear positive correlation between speaker and hearer informativity.

cor.test(data$Speaker.informativity, data$Hearer.informativity)
## 
##  Pearson's product-moment correlation
## 
## data:  data$Speaker.informativity and data$Hearer.informativity
## t = 293.7, df = 3998, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9762 0.9789
## sample estimates:
##    cor 
## 0.9776
qplot(data$Speaker.informativity, data$Hearer.informativity) + geom_smooth(method=lm)

(Figure: speaker vs. hearer informativity with linear fit)

There is a clear negative correlation between impairment and informativity. The remaining variance can plausibly be accounted for in terms of Number.of.states (see the main plot).

cor.test(data$Impairment, data$Speaker.informativity)
## 
##  Pearson's product-moment correlation
## 
## data:  data$Impairment and data$Speaker.informativity
## t = -211.1, df = 3998, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.9604 -0.9553
## sample estimates:
##    cor 
## -0.958
qplot(data$Impairment, data$Speaker.informativity) + geom_smooth(method=lm)

(Figure: impairment vs. speaker informativity with linear fit)
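One way to check that Number.of.states absorbs the remaining variance would be to compare nested linear models. A hedged sketch on synthetic data (the variable names mirror the real columns, but the numbers are fabricated for illustration): the model with the number-of-states factor should explain more variance than impairment alone.

```r
# Synthetic stand-in: informativity driven by impairment plus a
# number-of-states effect, mimicking the structure of the real data.
set.seed(2)
n <- 300
impairment <- runif(n, 0, 0.3)
n_states   <- sample(c(6, 10, 50), n, replace = TRUE)
informativity <- 1 - 2 * impairment + 0.002 * n_states + rnorm(n, sd = 0.05)

m1 <- lm(informativity ~ impairment)                     # impairment only
m2 <- lm(informativity ~ impairment + factor(n_states))  # add number of states

# The richer model should explain more variance (higher R^2).
r2 <- c(summary(m1)$r.squared, summary(m2)$r.squared)
print(round(r2, 3))
```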

How much identity in outcomes?

It turns out that there is surprisingly little variance in the recorded values of the dependent variables, at least for some triples of independent-variable values (tolerance, impairment, number of states). We can plot the difference between the maximal and the minimal value of each property for each triple of independent variables.

library(reshape2)  # provides melt()
library(plyr)      # provides ddply()

dm = melt(subset(ds, Iterations < 205)[,-which(names(ds) == "Iterations")],
          id.vars = c("Number.of.states", "Impairment", "Tolerance")) # melted data

MinMaxProperty = ddply(dm, .(Number.of.states, Impairment, Tolerance, variable), 
                      summarise, 
                      MaxMinDiff = max(value) - min(value),
                      MinProperty = min(value), MaxProperty = max(value), MeanProperty = mean(value))

pd <- position_dodge(.01)
sp = ggplot(MinMaxProperty, aes(x=Impairment, y=MaxMinDiff, colour=factor(Number.of.states))) + 
     geom_point(position = pd)

sp + facet_grid(variable ~ Tolerance, scales="free")

(Figure: max-min difference per property, faceted by variable and tolerance)

We can then look at the maximal difference among all properties. (This works because all relevant dependent measures scale between 0 and 1.)

diffTrials = ddply(MinMaxProperty,
                   .(Number.of.states, Impairment, Tolerance), 
                   summarise, 
                   diffTrials = max(MaxMinDiff) <= 1e-5,
                   maxDiffTrials = max(MaxMinDiff))

ggplot(diffTrials, aes(x=Impairment, y=maxDiffTrials)) + 
     geom_point() + facet_grid(Number.of.states ~ Tolerance, scales="free")

(Figure: maximal difference across all properties, faceted by number of states and tolerance)
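The per-group range logic above can also be sketched in base R without plyr. A toy example (group labels and values are hypothetical): `aggregate` computes max minus min within each group, the analogue of MaxMinDiff, and since every measure lies in [0, 1], the overall maximum range is itself bounded by 1.

```r
# Toy data: one measurement column per group, all values in [0, 1].
toy <- data.frame(
  group = rep(c("a", "b"), each = 3),
  value = c(0.20, 0.25, 0.22,   # group "a": small spread
            0.10, 0.80, 0.50)   # group "b": large spread
)

# Per-group max - min, the analogue of MaxMinDiff above.
ranges <- aggregate(value ~ group, data = toy,
                    FUN = function(v) max(v) - min(v))
print(ranges)

# Because all measures lie in [0, 1], the overall maximum range is <= 1.
max_range <- max(ranges$value)
stopifnot(max_range <= 1)
```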

We see clearly that, all else being equal (i.e., fixed tolerance and number of states), impairment ‘unifies’ evolutionary outcomes. When impairment is at least 0.1, outcomes are almost identical (in terms of the recorded properties).

Notice that this ‘uniformity’ is different from just convexity vs. non-convexity!

d = subset(dsub, variable == "Speaker.Convex.Cat")

pd <- position_dodge(.01)
sp = ggplot(d, aes(x=Impairment, y=1-value )) + 
     geom_point(position = pd)

sp + facet_grid(Number.of.states ~ Tolerance, scales="free")

(Figure: proportion of non-convex speaker categories, faceted by number of states and tolerance)

There are parameter settings for which all speaker strategies are convex (e.g. ns = 6, tolerance = 0.3, impairment = 0.05) but whose other properties nevertheless differ. Here are two strategies from that cell of the parameter space that differ maximally in terms of expected utility.

d = subset(data, Number.of.states == 6 & Tolerance == 0.3 &
             Impairment == 0.05 & Speaker.Convex.Cat == 1)

show(plot.language.MinMaxProperty(d, which(names(d) == "Expected.utility")))
## [1] "Expected.utility"
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.896   0.896   1.000   0.971   1.000   1.000

(Figure: two convex speaker strategies differing maximally in expected utility)

In contrast, if we look at a “uniform” cell of the parameter space, such as ns = 50, tolerance = 0.1, impairment = 0.1, then the strategies whose EU is minimal or maximal in that cell do not differ visually at all (modulo the arbitrariness of message use).

d = subset(data, Number.of.states == 50 & Tolerance == 0.1 &
             Impairment == 0.1)

show(plot.language.MinMaxProperty(d, which(names(d) == "Expected.utility")))
## [1] "Expected.utility"
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.903   0.903   0.903   0.903   0.903   0.903

(Figure: minimal and maximal expected-utility strategies, visually identical)

What this suggests is that the differences within some of the groups might be due to uneven splits of the state interval. Impairment leads to even splits and therefore to uniform evolutionary outcomes.
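What “evenly splitting the interval” amounts to can be made concrete. A toy sketch (the break points are fabricated for illustration): with n categories over [0, 1], an even split has all category widths equal, so the max-min difference in widths, the analogue of the uniformity measure above, is essentially zero, while an uneven split yields a clearly positive spread.

```r
# An even split of [0, 1] into n categories vs. an uneven one.
n <- 5
even_breaks   <- seq(0, 1, length.out = n + 1)
uneven_breaks <- c(0, 0.05, 0.3, 0.5, 0.9, 1)  # hypothetical uneven split

widths <- function(b) diff(b)  # category widths from break points

# Uniformity measure: max width minus min width (0 for an even split).
even_spread   <- max(widths(even_breaks))   - min(widths(even_breaks))
uneven_spread <- max(widths(uneven_breaks)) - min(widths(uneven_breaks))

print(c(even = even_spread, uneven = uneven_spread))
```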